
    An efficient task-based all-reduce for machine learning applications

    All-Reduce is a collective-combine operation frequently utilised in synchronous parameter updates in parallel machine learning algorithms. The performance of this operation - and subsequently of the algorithm itself - is heavily dependent on its implementation, its configuration and the supporting hardware on which it is run. Given the pivotal role of all-reduce, a failure in any of these regards will significantly impact the resulting scientific output. In this research we explore the performance of alternative all-reduce algorithms in data-flow graphs and compare these to the commonly used reduce-broadcast approach. We present an architecture and interface for all-reduce in task-based frameworks, and a parallelization scheme for object serialization and computation. We present a concrete, novel application of a butterfly all-reduce algorithm on the Apache Spark framework on a high-performance compute cluster, and demonstrate the effectiveness of the new butterfly algorithm with a logarithmic speed-up with respect to the vector length compared with the original reduce-broadcast method: a 9x speed-up is observed for vector lengths on the order of 10^8. This improvement comprises both algorithmic changes (65%) and parallel-processing optimization (35%). The effectiveness of the new butterfly all-reduce is demonstrated using real-world neural network applications with the Spark framework. For the model-update operation we observe significant speed-ups using the new butterfly algorithm compared with the original reduce-broadcast, for both smaller (CIFAR and MNIST) and larger (ImageNet) datasets.
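    A minimal sketch of the butterfly (recursive-doubling) all-reduce pattern the abstract refers to, simulated in plain NumPy rather than integrated into Spark's data-flow graph as in the paper; the XOR partner pattern and the log2(P) number of exchange rounds are the point, and the worker count and vector length are illustrative assumptions.

```python
import numpy as np

def butterfly_allreduce(vectors):
    """Sum-all-reduce over a power-of-two number of simulated workers.

    vectors: list of equal-length np.ndarray, one per worker.
    After the final round every worker holds the full sum.
    """
    p = len(vectors)
    assert p & (p - 1) == 0, "sketch assumes a power-of-two worker count"
    buf = [v.copy() for v in vectors]
    step = 1
    while step < p:                      # log2(p) rounds in total
        new = [None] * p
        for rank in range(p):
            partner = rank ^ step        # XOR partner for this round
            new[rank] = buf[rank] + buf[partner]
        buf = new
        step <<= 1
    return buf

if __name__ == "__main__":
    workers = [np.random.rand(10**6) for _ in range(8)]
    out = butterfly_allreduce(workers)
    print(np.allclose(out[0], sum(workers)))   # True: all workers agree on the sum
```

    Compared with reduce-broadcast, every worker both sends and receives in every round, which is where the logarithmic behaviour in the number of workers comes from.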

    A Hybrid Approach to Music Playlist Continuation Based on Playlist-Song Membership

    Automated music playlist continuation is a common task of music recommender systems that generally consists of providing a fitting extension to a given playlist. Collaborative filtering models, which extract abstract patterns from curated music playlists, tend to provide better playlist continuations than content-based approaches. However, pure collaborative filtering models have at least one of the following limitations: (1) they can only extend playlists profiled at training time; (2) they misrepresent songs that occur in very few playlists. We introduce a novel hybrid playlist continuation model based on what we name "playlist-song membership", that is, whether a given playlist and a given song fit together. The proposed model regards any playlist-song pair exclusively in terms of feature vectors. In light of this information, and after having been trained on a collection of labeled playlist-song pairs, the proposed model decides whether a playlist-song pair fits together or not. Experimental results on two datasets of curated music playlists show that the proposed playlist continuation model performs comparably to a state-of-the-art collaborative filtering model in the ideal situation of extending playlists profiled at training time with songs that occurred frequently in training playlists. In contrast to the collaborative filtering model, and as a result of its general understanding of playlist-song pairs in terms of feature vectors, the proposed model is additionally able to (1) extend non-profiled playlists and (2) recommend songs that occurred seldom or never in training playlists.
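    A minimal sketch of the "playlist-song membership" idea: each (playlist, song) pair is represented purely by feature vectors and a binary classifier decides whether they fit together. The random features, the concatenation of the two vectors and the logistic-regression stand-in are illustrative assumptions; the paper's actual features and model are not specified in the abstract.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)

# Hypothetical pre-computed features: 32-dim playlist vectors, 32-dim song vectors.
playlist_feats = rng.normal(size=(200, 32))
song_feats = rng.normal(size=(500, 32))

def pair_features(p_idx, s_idx):
    # Represent a (playlist, song) pair by concatenating the two feature vectors.
    return np.concatenate([playlist_feats[p_idx], song_feats[s_idx]])

# Labeled training pairs: 1 = song belongs to the playlist, 0 = it does not.
pos = [(p, s) for p in range(200) for s in rng.choice(500, 5, replace=False)]
neg = [(p, s) for p in range(200) for s in rng.choice(500, 5, replace=False)]
X = np.stack([pair_features(p, s) for p, s in pos + neg])
y = np.array([1] * len(pos) + [0] * len(neg))

model = LogisticRegression(max_iter=1000).fit(X, y)

def continue_playlist(p_idx, top_k=10):
    # Works for any playlist with a feature vector, profiled at training time or not.
    scores = model.predict_proba(
        np.stack([pair_features(p_idx, s) for s in range(500)]))[:, 1]
    return np.argsort(-scores)[:top_k]
```

    Because the decision depends only on feature vectors, a song seen in few or no training playlists can still be scored, which is the property the abstract highlights.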

    Distributed deep learning networks among institutions for medical imaging

    Objective: Deep learning has become a promising approach for automated support for clinical diagnosis. When medical data samples are limited, collaboration among multiple institutions is necessary to achieve high algorithm performance. However, sharing patient data often has limitations due to technical, legal, or ethical concerns. In this study, we propose methods of distributing deep learning models as an attractive alternative to sharing patient data. Methods: We simulate the distribution of deep learning models across 4 institutions using various training heuristics and compare the results with a deep learning model trained on centrally hosted patient data. The training heuristics investigated include ensembling single-institution models, single weight transfer, and cyclical weight transfer. We evaluated these approaches for image classification in 3 independent image collections (retinal fundus photos, mammography, and ImageNet). Results: We found that cyclical weight transfer resulted in performance comparable to that of a model trained on centrally hosted patient data, and that the performance of the cyclical weight transfer heuristic improves with a higher frequency of weight transfer. Conclusions: We show that distributing deep learning models is an effective alternative to sharing patient data. This finding has implications for any collaborative deep learning study.
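    A minimal sketch of the cyclical-weight-transfer heuristic: a single model's weights are passed from institution to institution in a fixed cycle, each site training locally for a short period before handing the weights on, so that only weights - never patient data - leave a site. The toy Keras model, the number of cycles and the per-site datasets are placeholders for illustration; the paper's architectures and transfer frequencies are not reproduced here.

```python
import tensorflow as tf

def make_model():
    # Placeholder classifier; the studies used image models appropriate to each dataset.
    return tf.keras.Sequential([
        tf.keras.Input(shape=(28, 28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation="relu"),
        tf.keras.layers.Dense(10, activation="softmax"),
    ])

def cyclical_weight_transfer(site_datasets, cycles=10, epochs_per_visit=1):
    """site_datasets: list of (x, y) NumPy arrays, one pair per institution."""
    model = make_model()
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    for _ in range(cycles):
        for x, y in site_datasets:            # visit each institution in turn
            # Only the weights move between sites; the raw patient data stays local.
            model.fit(x, y, epochs=epochs_per_visit, verbose=0)
    return model
```

    Increasing `cycles` while decreasing `epochs_per_visit` corresponds to the higher transfer frequency that the abstract reports as improving performance.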

    Deep learning classification in asteroseismology

    Music feature maps with convolutional neural networks for music genre classification

    Nowadays, deep learning is increasingly used for music genre classification, in particular convolutional neural networks (CNNs) taking as input a spectrogram treated as an image in which different types of structure are sought. However, in response to the criticism that it is difficult to understand the underlying relationships that neural networks learn from a spectrogram, we propose to use as CNN inputs a small set of eight music features chosen along three main musical dimensions: dynamics, timbre and tonality. With CNNs trained such that filter dimensions are interpretable in time and frequency, results show that these eight music features are more efficient than the 513 frequency bins of a spectrogram, and that late score fusion between systems based on both feature types reaches 91% accuracy on the GTZAN database.
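    A minimal sketch of a CNN that takes a small set of eight per-frame music features, rather than a 513-bin spectrogram, as its input "image" of shape (time, 8, 1). The layer sizes, kernel shapes and number of frames are illustrative assumptions; only the 10-genre GTZAN setting and the eight-feature input come from the abstract.

```python
import tensorflow as tf

N_FRAMES, N_FEATURES, N_GENRES = 128, 8, 10   # GTZAN has 10 genres

model = tf.keras.Sequential([
    tf.keras.Input(shape=(N_FRAMES, N_FEATURES, 1)),
    # Filters spanning several frames but a single feature column keep the
    # time and feature axes separately interpretable, as the abstract emphasises.
    tf.keras.layers.Conv2D(16, kernel_size=(8, 1), activation="relu"),
    tf.keras.layers.MaxPooling2D(pool_size=(4, 1)),
    tf.keras.layers.Conv2D(32, kernel_size=(8, 1), activation="relu"),
    tf.keras.layers.GlobalAveragePooling2D(),
    tf.keras.layers.Dense(N_GENRES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```

    A spectrogram-based system of the same shape but with 513 feature columns could be trained in parallel, with the two softmax outputs averaged for the late score fusion mentioned in the abstract.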

    Learning Tree-Structured Detection Cascades for Heterogeneous Networks of Embedded Devices

    In this paper, we present a new approach to learning cascaded classifiers for use in computing environments that involve networks of heterogeneous, resource-constrained, low-power embedded compute and sensing nodes. We present a generalization of the classical linear detection cascade to the case of tree-structured cascades, where different branches of the tree execute on different physical compute nodes in the network. Different nodes have access to different features, as well as to potentially different computation and energy resources. We concentrate on the problem of jointly learning the parameters for all of the classifiers in the cascade, given a fixed cascade architecture and a known set of costs required to carry out the computation at each node. To accomplish the objective of jointly learning all detectors, we propose a novel approach to combining classifier outputs during training that better matches the hard cascade setting in which the learned system will be deployed. This work is motivated by research in mobile health, where energy-efficient real-time detectors integrating information from multiple wireless on-body sensors and a smartphone are needed for real-time monitoring and delivering just-in-time adaptive interventions. We apply our framework to two activity recognition datasets as well as the problem of cigarette smoking detection from a combination of wrist-worn actigraphy data and respiration chest band data. (arXiv admin note: substantial text overlap with arXiv:1607.0373)
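    A minimal sketch of evaluating a hard, tree-structured detection cascade at runtime: a cheap detector on the wrist sensor gates whether the chest-band node and then the phone ever have to compute at all, and the path through the tree depends on which devices are available. The linear detectors, thresholds and feature splits are illustrative assumptions; the paper's contribution is the joint learning of such a cascade, which is not reproduced here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

class Stage:
    """One cascade node: a linear detector over the features local to that device."""
    def __init__(self, w, b, threshold):
        self.w, self.b, self.threshold = np.asarray(w), b, threshold

    def fires(self, x_local):
        return sigmoid(self.w @ x_local + self.b) >= self.threshold

# Hypothetical learned parameters for three devices.
wrist = Stage(w=[0.8, -0.3], b=0.0, threshold=0.3)        # cheap, high recall
chest = Stage(w=[1.2, 0.5], b=-0.2, threshold=0.5)
phone = Stage(w=[0.9, 0.9, 0.4], b=-0.5, threshold=0.6)   # expensive, final decision

def detect(wrist_feats, chest_feats, phone_feats, chest_available=True):
    """One routing policy through the tree: the root (wrist) gates everything;
    whether the chest-band branch runs depends on whether the band is worn."""
    if not wrist.fires(wrist_feats):
        return False          # rejected on-body: no radio or phone-side cost incurred
    if chest_available and not chest.fires(chest_feats):
        return False
    return phone.fires(phone_feats)

print(detect(np.array([1.0, 0.2]), np.array([0.6, 0.1]),
             np.array([0.5, 0.7, 0.2])))
```

    The energy saving comes from the fact that most negative windows are rejected by the cheapest stage and never reach the costlier nodes.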

    Classifying learner behavior from high frequency touchscreen data using recurrent neural networks

    Sensor stream data, particularly data collected at millisecond granularity, are notoriously difficult to extract classifiable signal from. Adding to the challenge is the limited domain knowledge that exists at this level of sensor interaction, which prohibits a comprehensive manual feature engineering approach to classifying those streams. In this paper, we attempt to enhance the assessment capability of a touchscreen-based ratio tutoring system by using Recurrent Neural Networks (RNNs) to predict the strategy being demonstrated by students from their 60 Hz data streams. We hypothesize that the ability of neural networks to learn representations automatically, instead of relying on human feature engineering, may benefit this classification task. Our RNN and baseline models were trained and cross-validated at several levels on historical data that had been human-coded with the task strategy believed to be exhibited by the learner. Our RNN approach to this historically difficult high-frequency data classification task moderately advances performance above the baselines, and we discuss the implications this level of assessment performance has for enabling greater adaptive supports in the tutoring system.
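    A minimal sketch of the kind of sequence classifier the abstract describes: fixed-length windows of the raw 60 Hz touchscreen stream fed to a recurrent network that predicts the learner's strategy, with no hand-engineered features. The LSTM size, the four input channels (e.g. x, y, touch-down flag, inter-event time) and the number of strategy classes are illustrative assumptions, not the paper's configuration.

```python
import tensorflow as tf

SEQ_LEN, N_CHANNELS, N_STRATEGIES = 600, 4, 5   # e.g. a 10 s window at 60 Hz

model = tf.keras.Sequential([
    tf.keras.Input(shape=(SEQ_LEN, N_CHANNELS)),
    tf.keras.layers.Masking(mask_value=0.0),     # shorter windows are zero-padded
    tf.keras.layers.LSTM(64),                    # learns its own representation of the stream
    tf.keras.layers.Dense(N_STRATEGIES, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
# model.fit(x_train, y_train, validation_data=(x_val, y_val), epochs=20)
```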